Arabic is a complex script that is written from right to left. Each character has up to four basic forms, depending on its position within the word: initial, medial, final, or isolated. Arabic also uses ligatures, which are single glyphs that combine two or more characters. Most notably, the lam-alef ligature consists of lam (initial or medial form) joined with alef (final form).
In some encodings, such as EBCDIC 420, the lam-alef ligature has a single code point, whereas others, such as Windows 1256 or ISO-8859-6, encode the lam and alef characters as separate code points. When SAS transcodes data from EBCDIC 420 to one of the other Arabic single-byte encodings, the combined lam-alef character cannot be mapped to a single code point, so a substitution character is placed in the data instead.
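The substitution problem can be reproduced outside SAS. The following Python sketch is only an illustration, not SAS transcoding: it assumes the Unicode presentation form U+FEFB (lam-alef, isolated form) as a stand-in for the single EBCDIC 420 code point, and it uses Python's built-in `cp1256` and `iso8859_6` codecs. The helper name `transcode` is hypothetical.

```python
# Unicode stand-in (an assumption) for a single-code-point lam-alef ligature.
LAM_ALEF_LIGATURE = "\uFEFB"
# The same text written as two separate characters: lam followed by alef.
LAM_THEN_ALEF = "\u0644\u0627"


def transcode(text, encoding):
    """Encode text, substituting '?' for anything the target cannot represent."""
    return text.encode(encoding, errors="replace")


for encoding in ("cp1256", "iso8859_6"):
    # The ligature has no code point in the target, so it is substituted.
    print(encoding, "ligature ->", transcode(LAM_ALEF_LIGATURE, encoding))
    # Separate lam and alef each have a code point, so both bytes survive.
    print(encoding, "lam+alef ->", transcode(LAM_THEN_ALEF, encoding).hex())
```

In this sketch the ligature is lost to a substitution character, while the two-character spelling encodes cleanly to two bytes.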
A new function, ANORM420, is introduced in SAS® 9.4 TS1M1. It maps the EBCDIC 420 lam-alef to two separate code points in your data. The function also has options that enable you to add spaces after code points that represent the final form of a character and to convert Arabic-Indic numerals to digits.
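The kind of normalization described above can be sketched in Python. This is not the ANORM420 implementation and says nothing about its arguments. It assumes Unicode text rather than EBCDIC 420 bytes, uses NFKC compatibility normalization as a swapped-in technique to split the ligature, and assumes that "digits" means the ASCII digits 0 through 9. The name `normalize_arabic` is hypothetical, and the option that adds spaces after final forms is not modeled.

```python
import unicodedata

# Map Arabic-Indic digits (U+0660..U+0669) to ASCII digits; an assumed target.
ARABIC_INDIC_TO_ASCII = {0x0660 + i: ord("0") + i for i in range(10)}


def normalize_arabic(text):
    """Split presentation-form ligatures and convert Arabic-Indic digits."""
    # NFKC replaces U+FEFB (lam-alef ligature) with U+0644 U+0627 (lam, alef).
    decomposed = unicodedata.normalize("NFKC", text)
    return decomposed.translate(ARABIC_INDIC_TO_ASCII)


sample = "\uFEFB \u0661\u0669"  # lam-alef ligature, space, Arabic-Indic 1 and 9
normalized = normalize_arabic(sample)
print([f"U+{ord(ch):04X}" for ch in normalized])
# After normalization every character has a code point in Windows 1256.
print(normalized.encode("cp1256").hex())
```

Note that NFKC also folds other Arabic presentation forms to their base letters, which is broader than the lam-alef handling the note describes.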
The following example shows the ANORM420 function usage.
s1=C8E5C7E3C7D320A02020 s2=C8E5C7E3C7D3A0202020 s3=59CD57BC577740454040
Click the Hot Fix tab in this note to access the hot fix for this issue.
| Product Family | Product | System | Product Release (Reported) | Product Release (Fixed*) | SAS Release (Reported) | SAS Release (Fixed*) |
|---|---|---|---|---|---|---|
| SAS System | Base SAS | Z64 | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Microsoft® Windows® for x64 | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Microsoft® Windows® for 64-Bit Itanium-based Systems | 9.4 | | 9.4 TS1M0 | |
| SAS System | Base SAS | Microsoft Windows 8 Enterprise 32-bit | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Microsoft Windows 8 Enterprise x64 | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Microsoft Windows 8 Pro 32-bit | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Microsoft Windows 8 Pro x64 | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Microsoft Windows Server 2008 R2 | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Microsoft Windows Server 2008 for x64 | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Microsoft Windows Server 2012 Datacenter | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Microsoft Windows Server 2012 Std | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Windows 7 Enterprise x64 | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Windows 7 Professional x64 | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | 64-bit Enabled AIX | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | 64-bit Enabled HP-UX | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | 64-bit Enabled Solaris | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | HP-UX IPF | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Linux for x64 | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |
| SAS System | Base SAS | Solaris for x64 | 9.4 | 9.4_M1 | 9.4 TS1M0 | 9.4 TS1M1 |